Practical applications of mechanical metamaterials often involve solving inverse problems in which the objective is to find the (multiple) microarchitectures that give rise to a given set of properties. The limited resolution of additive manufacturing techniques often requires solving such inverse problems for specific sizes. One should, therefore, find multiple microarchitectural designs that exhibit the desired properties for a specimen with given dimensions. Moreover, the candidate microarchitectures should be resistant to fatigue and fracture, meaning that peak stresses should be minimized as well. Such a multi-objective inverse design problem is formidably difficult to solve, but its solution is the key to real-world applications of mechanical metamaterials. Here, we propose a modular approach titled 'Deep-DRAM' (deep learning for the design of random-network metamaterials) that combines four decoupled models, including two deep learning models (DLM), a deep generative model (DGM) based on conditional variational autoencoders (CVAE), and direct finite element (FE) simulations. Deep-DRAM integrates these models into a unified framework capable of finding many solutions to the multi-objective inverse design problem posed here. The integrated framework first passes the desired elastic properties to the DGM, which returns a set of candidate designs. The candidate designs, together with the target specimen dimensions, are then passed to the DLM, which predicts their actual elastic properties considering the specimen size. After a filtering step based on the closeness of the actual properties to the desired ones, the last step uses direct FE simulations to identify the designs with the minimum peak stresses.
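The four-step flow described in this abstract (generate, predict, filter, rank by peak stress) can be sketched as follows. This is only an illustrative outline, not the authors' implementation: the function `inverse_design`, its parameters, the relative-error filter, and the three callables `generate`, `predict`, and `peak_stress` are all hypothetical stand-ins for the DGM, the DLM, and the FE solver.

```python
import numpy as np


def inverse_design(target, size, generate, predict, peak_stress,
                   n_candidates=100, tol=0.05, keep=5):
    """Hypothetical sketch of a generate -> predict -> filter -> rank pipeline.

    generate(target, n)  -> list of candidate designs       (stand-in for the DGM)
    predict(design, size) -> predicted property vector       (stand-in for the DLM)
    peak_stress(design)  -> scalar peak stress               (stand-in for FE)
    """
    target = np.asarray(target, dtype=float)

    # Step 1: ask the generative model for candidate designs.
    candidates = generate(target, n_candidates)

    # Step 2: predict size-dependent properties of every candidate.
    predicted = np.array([predict(c, size) for c in candidates])

    # Step 3: keep candidates whose predicted properties are close to the target
    # (relative Euclidean error is an assumed closeness measure).
    rel_err = np.linalg.norm(predicted - target, axis=1) / np.linalg.norm(target)
    shortlisted = [c for c, e in zip(candidates, rel_err) if e <= tol]

    # Step 4: rank the survivors by peak stress, lowest first.
    return sorted(shortlisted, key=peak_stress)[:keep]
```

Passing the three models in as callables mirrors the decoupling the abstract emphasizes: any one of them could be swapped without touching the others.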
translated by Google Translate
Automatic differentiation (AD) is a technique for computing the derivative of a function represented by a program. This technique is considered the de facto standard for computing derivatives in many machine learning and optimisation software tools. Despite the practicality of this technique, the performance of the differentiated programs, especially for functional languages and in the presence of vectors, is suboptimal. We present an AD system for a higher-order functional array-processing language. The core functional language underlying this system simultaneously supports both source-to-source forward-mode AD and global optimisations such as loop transformations. In combination, gradient computation with forward-mode AD can be as efficient as reverse mode, and the Jacobian matrices required for numerical algorithms such as Gauss-Newton and Levenberg-Marquardt can be efficiently computed.
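For readers unfamiliar with forward-mode AD, a minimal sketch using dual numbers is given below. It relies on runtime operator overloading in Python, which is a different mechanism from the source-to-source transformation in a functional array language that the abstract describes; the names `Dual`, `sin`, `gradient`, and `f` are illustrative assumptions. The sketch also shows why naive forward mode needs one pass per input to build a gradient, which is the cost the abstract's global optimisations aim to remove.

```python
from dataclasses import dataclass
import math


@dataclass(frozen=True)
class Dual:
    """A value paired with its derivative (tangent) with respect to one input."""
    val: float
    tan: float = 0.0

    def __add__(self, other):
        other = _lift(other)
        # Sum rule: (u + v)' = u' + v'
        return Dual(self.val + other.val, self.tan + other.tan)

    __radd__ = __add__

    def __mul__(self, other):
        other = _lift(other)
        # Product rule: (u * v)' = u' * v + u * v'
        return Dual(self.val * other.val,
                    self.tan * other.val + self.val * other.tan)

    __rmul__ = __mul__


def _lift(x):
    """Treat plain numbers as constants (zero tangent)."""
    return x if isinstance(x, Dual) else Dual(float(x))


def sin(x: Dual) -> Dual:
    # Chain rule: sin(u)' = cos(u) * u'
    return Dual(math.sin(x.val), math.cos(x.val) * x.tan)


def gradient(f, xs):
    """Forward-mode gradient: one pass per input, seeding that input's tangent with 1."""
    grads = []
    for i in range(len(xs)):
        args = [Dual(x, 1.0 if j == i else 0.0) for j, x in enumerate(xs)]
        grads.append(f(*args).tan)
    return grads


def f(x, y):
    return x * y + sin(x)
```

Calling `gradient(f, [2.0, 3.0])` evaluates `f` twice, once per input, and returns the partial derivatives `y + cos(x)` and `x` evaluated at the given point.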
A prominent approach to solving combinatorial optimization problems on parallel hardware is Ising machines, i.e., hardware implementations of networks of interacting binary spin variables. Most Ising machines leverage second-order interactions, although important classes of optimization problems, such as satisfiability problems, map more seamlessly to Ising networks with higher-order interactions. Here, we demonstrate that higher-order Ising machines can solve satisfiability problems more resource-efficiently, in terms of the number of spin variables and their connections, than traditional second-order Ising machines. Further, our results show on a benchmark dataset of Boolean \textit{k}-satisfiability problems that higher-order Ising machines implemented with coupled oscillators rapidly find solutions that are better than those found by second-order Ising machines, thus improving the current state of the art for Ising machines.
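To make the "more seamless" mapping concrete, the sketch below uses a standard textbook encoding in which each \textit{k}-literal clause contributes a single product of \textit{k} spin factors, i.e., one interaction of order \textit{k}, with no auxiliary spins. This is an assumed illustration and not necessarily the exact formulation used in the paper; the brute-force search merely stands in for the coupled-oscillator dynamics, and all function names are hypothetical.

```python
from itertools import product


def clause_energy(clause, spins):
    """Penalty of one clause: 1 if every literal is false, otherwise 0.

    clause: signed 1-based variable indices in DIMACS style, e.g. [1, -2, 3]
            means (x1 OR NOT x2 OR x3).
    spins:  sequence of +1 (True) / -1 (False), one per variable.
    """
    e = 1.0
    for lit in clause:
        c = 1 if lit > 0 else -1          # sign of the literal
        s = spins[abs(lit) - 1]           # spin of the variable
        e *= (1 - c * s) / 2              # 1 when the literal is false, else 0
    return e


def energy(clauses, spins):
    """Total energy equals the number of violated clauses."""
    return sum(clause_energy(c, spins) for c in clauses)


def brute_force_ground_state(clauses, n_vars):
    """Exhaustive search over all spin states (only feasible for tiny instances)."""
    return min(product((-1, 1), repeat=n_vars),
               key=lambda s: energy(clauses, s))
```

Expanding the product for a 3-literal clause yields terms up to third order in the spins, which is why a second-order machine would need extra variables to represent the same clause. A formula is satisfiable exactly when the minimum energy is zero.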
Meshing is a critical but user-intensive process necessary for stable and accurate simulations in computational fluid dynamics (CFD). Mesh generation is often a bottleneck in CFD pipelines. Adaptive meshing techniques allow the mesh to be updated automatically to produce an accurate solution for the problem at hand. Existing classical techniques for adaptive meshing require either additional functionality from solvers, many training simulations, or both. Current machine learning techniques often incur substantial computational cost for training data generation and are restricted in scope to the flow regime of the training data. MeshDQN is developed as a general-purpose deep reinforcement learning framework to iteratively coarsen meshes while preserving target property calculation. A graph neural network-based deep Q network is used to select mesh vertices for removal, and solution interpolation is used to bypass expensive simulations at each step in the improvement process. MeshDQN requires a single simulation prior to mesh coarsening and makes no assumptions about flow regime, mesh type, or solver, requiring only the ability to modify meshes directly in a CFD pipeline. MeshDQN successfully improves meshes for two 2D airfoils.
Event Detection (ED) is the task of identifying and classifying trigger words of event mentions in text. Despite considerable research efforts in recent years for English text, the task of ED in other languages has been significantly less explored. For non-English languages, important research questions for ED include how well existing ED models perform on different languages, how challenging ED is in other languages, and how well ED knowledge and annotation can be transferred across languages. To answer those questions, it is crucial to obtain multilingual ED datasets that provide consistent event annotation for multiple languages. There exist some multilingual ED datasets; however, they tend to cover a handful of languages and mainly focus on popular ones. Many languages are not covered in existing multilingual ED datasets. In addition, the current datasets are often small and not accessible to the public. To overcome those shortcomings, we introduce a new large-scale multilingual dataset for ED (called MINION) that consistently annotates events for 8 different languages; 5 of them have not been supported by existing multilingual datasets. We also perform extensive experiments and analysis to demonstrate the challenges and transferability of ED across languages in MINION, which together call for more research effort in this area.
Event Extraction (EE) is one of the fundamental tasks in Information Extraction (IE) that aims to recognize event mentions and their arguments (i.e., participants) from text. Due to its importance, extensive methods and resources have been developed for Event Extraction. However, one limitation of current EE research is that non-English languages remain under-explored, the main hindrance being the lack of high-quality multilingual EE datasets for model training and evaluation. To address this limitation, we propose a novel Multilingual Event Extraction dataset (MEE) that provides annotation for more than 50K event mentions in 8 typologically different languages. MEE comprehensively annotates data for entity mentions, event triggers and event arguments. We conduct extensive experiments on the proposed dataset to reveal challenges and opportunities for multilingual EE.
Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption, most LLMs are developed by resource-rich organizations and are frequently kept from the public. As a step towards democratizing this powerful technology, we present BLOOM, a 176B-parameter open-access language model designed and built thanks to a collaboration of hundreds of researchers. BLOOM is a decoder-only Transformer language model that was trained on the ROOTS corpus, a dataset comprising hundreds of sources in 46 natural and 13 programming languages (59 in total). We find that BLOOM achieves competitive performance on a wide variety of benchmarks, with stronger results after undergoing multitask prompted finetuning. To facilitate future research and applications using LLMs, we publicly release our models and code under the Responsible AI License.
The mixture of factor analyzers (MFA) model is an efficient model for the analysis of high-dimensional data, in which the factor-analyzer structure imposed on the covariance matrices reduces the number of free parameters. The model also provides an important methodology to determine latent groups in data. Several studies have extended the model to asymmetric data and/or data with outliers, with some known computational limitations, and these extensions have been examined in the frequentist setting. In this paper, an MFA model with a rich and flexible class of skew normal (unrestricted) generalized hyperbolic (called SUNGH) distributions, along with a Bayesian structure with several computational benefits, is introduced. The SUNGH family provides considerable flexibility to model skewness in different directions as well as allowing for heavy-tailed data. The structure of the SUNGH family has several desirable properties, including an analytically flexible density that eases the computation required for parameter estimation. In factor analysis models, the SUNGH family also allows for skewness and heavy tails in both the error component and the factor scores. In the present study, the advantages of using this family of distributions are discussed, and the efficiency of the introduced MFA model is demonstrated using real data examples and simulation.
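The parameter reduction mentioned in this abstract can be illustrated with a short count. The sketch assumes the standard Gaussian factor-analyzer parameterization of each component covariance, Sigma = Lambda Lambda^T + Psi, with a p x q loading matrix Lambda and a diagonal Psi; it does not account for the additional skewness and tail parameters of the SUNGH family, and the function names are illustrative.

```python
def full_cov_params(p: int) -> int:
    """Free parameters in an unrestricted symmetric p x p covariance matrix."""
    return p * (p + 1) // 2


def fa_cov_params(p: int, q: int) -> int:
    """Free parameters in Lambda Lambda^T + Psi (assumed standard parameterization).

    p * q loadings plus p diagonal noise variances, minus q * (q - 1) / 2
    because the loadings are only identified up to a rotation of the factors.
    """
    return p * q + p - q * (q - 1) // 2
```

For example, with p = 100 observed dimensions and q = 5 factors, an unrestricted covariance has 5050 free parameters per component, while the factor-analyzer form has 590.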
Context-sensitive two-point layer 5 pyramidal cells (L5PCs) were discovered as long ago as 1999. However, the potential of this discovery to provide useful neural computation has yet to be demonstrated. Here we show for the first time how a transformative L5PC-driven deep neural network (DNN), termed the multisensory cooperative computing (MCC) architecture, can effectively process large amounts of heterogeneous real-world audio-visual (AV) data, using far less energy than the best available 'point' neuron-driven DNNs. A novel highly-distributed parallel implementation on a Xilinx UltraScale+ MPSoC device estimates energy savings of up to $245759 \times 50000$ $\mu$J (i.e., 62% less than the baseline model in a semi-supervised learning setup), where a single synapse consumes $8e^{-5}\mu$J. In a supervised learning setup, the energy saving can potentially reach up to 1250x less (per feedforward transmission) than the baseline model. The significantly reduced neural activity in MCC leads to inherently fast learning and resilience against sudden neural damage. This remarkable performance in pilot experiments demonstrates the embodied neuromorphic intelligence of our proposed cooperative L5PC, which receives input from diverse neighbouring neurons as context to amplify the most salient and relevant information for onward transmission, out of the overwhelmingly large multimodal information utilised at the early stages of on-chip training. Our proposed approach opens new cross-disciplinary avenues for future on-chip DNN training implementations and posits a radical shift in current neuromorphic computing paradigms.
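As free online encyclopedias with a large volume of content, Wikipedia and Wikidata are key to many natural language processing (NLP) tasks, such as information retrieval, knowledge base construction, machine translation, text classification, and text summarization. In this paper, we introduce WikiDes, a novel dataset for generating short descriptions of Wikipedia articles, framed as a text summarization problem. The dataset consists of 80K English samples on 6987 topics. We set up a two-phase summarization method, description generation (Phase I) and candidate ranking (Phase II), as a strong approach that relies on transfer learning and contrastive learning. For description generation, T5 and BART show their superiority over other small-scale pre-trained models. By applying contrastive learning with the diverse inputs from beam search, the metric-based ranking models outperform the direct description generation models by up to 22 ROUGE in both the topic-exclusive split and the topic-independent split. Furthermore, the resulting descriptions in Phase II are supported by human evaluation, being chosen in over 45.33% of cases compared with 23.66% for Phase I, against the gold descriptions. In terms of sentiment analysis, the generated descriptions cannot effectively capture all sentiment polarities from the paragraphs, whereas the gold descriptions do this better. Automatically generating new descriptions reduces the human effort required to create them and enriches Wikidata-based knowledge graphs. Our paper has a practical impact on Wikipedia and Wikidata, given the thousands of descriptions involved. Finally, we expect WikiDes to be a useful dataset for related work on capturing salient information from short paragraphs. The curated dataset is publicly available at: https://github.com/declare-lab/wikides.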